187 research outputs found

    A single-array preprocessing method for estimating full-resolution raw copy numbers from all Affymetrix genotyping arrays including GenomeWideSNP 5 & 6

    Motivation: High-resolution copy-number (CN) analysis has in recent years gained much attention, not only for the purpose of identifying CN aberrations associated with a certain phenotype, but also for identifying CN polymorphisms. In order for such studies to be successful and cost effective, the statistical methods have to be optimized. We propose a single-array preprocessing method for estimating full-resolution total CNs. It is applicable to all Affymetrix genotyping arrays, including the recent ones that also contain non-polymorphic probes. A reference signal is only needed at the last step when calculating relative CNs. Results: As with our method for earlier generations of arrays, this one controls for allelic crosstalk, probe affinities and PCR fragment-length effects. Additionally, it also corrects for probe sequence effects and co-hybridization of fragments digested by multiple enzymes that takes place on the latest chips. We compare our method with Affymetrix's CN5 method and the dChip method by assessing how well they differentiate between various CN states at the full resolution and various amounts of smoothing. Although CRMA v2 is a single-array method, we observe that it performs as well as or better than alternative methods that use data from all arrays for their preprocessing. This shows that it is possible to do online analysis in large-scale projects where additional arrays are introduced over time. Availability: A bounded-memory implementation that can process any number of arrays is available in the open source R package aroma.affymetrix. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

    A simple method for assigning genomic grade to individual breast tumours

    Background: The prognostic value of grading in breast cancer can be increased with microarray technology, but proposed strategies are disadvantaged by the use of specific training data or parallel microscopic grading. Here, we investigate the performance of a method that uses no information outside the breast profile of interest. Results: In 251 profiled tumours we optimised a method that achieves grading by comparing rank means for genes predictive of high and low grade biology; a simpler method that allows for truly independent estimation of accuracy. Validation was carried out in 594 patients derived from several independent data sets. We found that accuracy was good: for low grade (G1) tumours 83-94%, for high grade (G3) tumours 74-100%. In keeping with the aim of improved grading, two groups of intermediate grade (G2) cancers with significantly different outcome could be discriminated. Conclusion: This validates the concept of microarray-based grading in breast cancer, and provides a more practical method to achieve it. A simple R script for grading is available in an additional file. Clinical implementation could achieve better estimation of recurrence risk for 40 to 50% of breast cancer patients.

    The influence of feature selection methods on accuracy, stability and interpretability of molecular signatures

    Motivation: Biomarker discovery from high-dimensional data is a crucial problem with enormous applications in biology and medicine. It is also extremely challenging from a statistical viewpoint, but surprisingly few studies have investigated the relative strengths and weaknesses of the plethora of existing feature selection methods. Methods: We compare 32 feature selection methods on 4 public gene expression datasets for breast cancer prognosis, in terms of predictive performance, stability and functional interpretability of the signatures they produce. Results: We observe that the feature selection method has a significant influence on the accuracy, stability and interpretability of signatures. Simple filter methods generally outperform more complex embedded or wrapper methods, and ensemble feature selection has generally no positive effect. Overall a simple Student's t-test seems to provide the best results. Availability: Code and data are publicly available at http://cbio.ensmp.fr/~ahaury/

    Gene expression profiling of breast cancer

    Molecular types of breast cancer: Important differences in the clinical behaviour of oestrogen receptor (ER)-positive and ER-negative cancers have been recognised for a long time [1]. Nevertheless, breast cancer was regarded as a single disease with variable histology and clinical course. More recently, high-throughput analytical methods revealed unexpectedly large-scale molecular differences between ER-positive cancers and ER-negative cancers [2]. These results prompted a conceptual shift in the classification of breast cancer, which is increasingly viewed not as a single disease but as a collection of several biologically distinct neoplastic diseases that arise from the breast epithelium. The different molecular types of breast cancer may originate from different epithelial precursors such as luminal (ER-positive cancers) or basal (ER-negative tumours) epithelia.

    Phenotypic diversity of T cells in human primary and metastatic brain tumors revealed by multiomic interrogation.

    The immune-specialized environment of the healthy brain is tightly regulated to prevent excessive neuroinflammation. However, after cancer development, a tissue-specific conflict between brain-preserving immune suppression and tumor-directed immune activation may ensue. To interrogate potential roles of T cells in this process, we profiled these cells from individuals with primary or metastatic brain cancers via integrated analyses on the single-cell and bulk population levels. Our analysis revealed similarities and differences in T cell biology between individuals, with the most pronounced differences observed in a subgroup of individuals with brain metastasis, characterized by accumulation of CXCL13-expressing CD39+ potentially tumor-reactive T (pTRT) cells. In this subgroup, high pTRT cell abundance was comparable to that in primary lung cancer, whereas all other brain tumors had low levels, similar to primary breast cancer. These findings indicate that T cell-mediated tumor reactivity can occur in certain brain metastases and may inform stratification for treatment with immunotherapy.

    Selecting control genes for RT-QPCR using public microarray data

    Background: Gene expression analysis has emerged as a major biological research area, with real-time quantitative reverse transcription PCR (RT-QPCR) being one of the most accurate and widely used techniques for expression profiling of selected genes. In order to obtain results that are comparable across assays, a stable normalization strategy is required. In general, the normalization of PCR measurements between different samples uses one to several control genes (e.g. housekeeping genes), from which a baseline reference level is constructed. Thus, the choice of the control genes is of utmost importance, yet there is no generally accepted standard technique for screening a large number of candidates and identifying the best ones. Results: We propose a novel approach for scoring and ranking candidate genes for their suitability as control genes. Our approach relies on publicly available microarray data and allows the combination of multiple data sets originating from different platforms and/or representing different pathologies. The use of microarray data allows the screening of tens of thousands of genes, producing very comprehensive lists of candidates. We also provide two lists of candidate control genes: one which is breast cancer-specific and one with more general applicability. Two genes from the breast cancer list which had not been previously used as control genes are identified and validated by RT-QPCR. Open source R functions are available at http://www.isrec.isb-sib.ch/~vpopovic/research/ Conclusion: We proposed a new method for identifying candidate control genes for RT-QPCR which was able to rank thousands of genes according to some predefined suitability criteria and we applied it to the case of breast cancer. We also empirically showed that translating the results from microarray to PCR platform was achievable.

    Test of four colon cancer risk-scores in formalin fixed paraffin embedded microarray gene expression data.

    BACKGROUND: Prognosis prediction for resected primary colon cancer is based on the T-stage Node Metastasis (TNM) staging system. We investigated if four well-documented gene expression risk scores can improve patient stratification. METHODS: Microarray-based versions of risk-scores were applied to a large independent cohort of 688 stage II/III tumors from the PETACC-3 trial. Prognostic value for relapse-free survival (RFS), survival after relapse (SAR), and overall survival (OS) was assessed by regression analysis. Improvement over a reference prognostic model was assessed with the area under the curve (AUC) of receiver operating characteristic (ROC) curves. All statistical tests were two-sided, except the AUC increase. RESULTS: All four risk scores (RSs) showed a statistically significant association (single-test, P < .0167) with OS or RFS in univariate models, but with HRs below 1.38 per interquartile range. Three scores were predictors of shorter RFS, one of shorter SAR. Each RS could only marginally improve an RFS or OS model with the known factors T-stage, N-stage, and microsatellite instability (MSI) status (AUC gains < 0.025 units). The pairwise interscore discordance was never high (maximal Spearman correlation = 0.563). A combined score showed a trend to higher prognostic value and higher AUC increase for OS (HR = 1.74, 95% confidence interval [CI] = 1.44 to 2.10, P < .001, AUC from 0.6918 to 0.7321) and RFS (HR = 1.56, 95% CI = 1.33 to 1.84, P < .001, AUC from 0.6723 to 0.6945) than any single score. CONCLUSIONS: The four tested gene expression-based risk scores provide prognostic information but contribute only marginally to improving models based on established risk factors. A combination of the risk scores might provide more robust information. Predictors of RFS and SAR might need to be different.

    A clinically relevant gene signature in triple negative and basal-like breast cancer

    Introduction: Current prognostic gene expression profiles for breast cancer mainly reflect proliferation status and are most useful in ER-positive cancers. Triple negative breast cancers (TNBC) are clinically heterogeneous, and prognostic markers and biology-based therapies are needed to better treat this disease. Methods: We assembled Affymetrix gene expression data for 579 TNBC and performed unsupervised analysis to define metagenes that distinguish molecular subsets within TNBC. We used n = 394 cases for discovery and n = 185 cases for validation. Sixteen metagenes emerged that identified basal-like, apocrine and claudin-low molecular subtypes, or reflected various non-neoplastic cell populations, including immune cells, blood, adipocytes, stroma, angiogenesis and inflammation within the cancer. The expressions of these metagenes were correlated with survival and multivariate analysis was performed, including routine clinical and pathological variables. Results: Seventy-three percent of TNBC displayed basal-like molecular subtype that correlated with high histological grade and younger age. Survival of basal-like TNBC was not different from non basal-like TNBC. High expression of immune cell metagenes was associated with good prognosis, and high expression of inflammation and angiogenesis-related metagenes was associated with poor prognosis. A ratio of high B-cell and low IL-8 metagenes identified 32% of TNBC with good prognosis (hazard ratio (HR) 0.37, 95% CI 0.22 to 0.61; P < 0.001) and was the only significant predictor in multivariate analysis including routine clinicopathological variables. Conclusions: We describe a ratio of high B-cell presence and low IL-8 activity as a powerful new prognostic marker for TNBC. Inhibition of the IL-8 pathway also represents an attractive novel therapeutic target for this disease.